Similarity Measures and Weighted Fuzzy C-Mean Clustering Algorithm
Author
Abstract
In this paper we study the fuzzy c-means clustering algorithm combined with the principal components method. Demonstrative analysis indicates that the new clustering method performs better than some existing clustering algorithms. We also consider the validity of the clustering method.

Keywords: FCM algorithm, Principal Components Analysis, Clus...

I. FUZZY SET THEORETIC SIMILARITY MEASURES

One of the most important issues in recommender systems research is computing the similarity between users, and between items (products, events, services, etc.). This in turn depends heavily on the appropriateness and reliability of the methods of representation. Set-theoretic, proximity-based, and logic-based measures are the three classes of similarity measures. In the fuzzy set and possibility framework, the similarity of users or items is computed from the membership functions of the fuzzy sets associated with the user or item features. Following the work of Cross and Sudkamp [1], those similarity measures [2] that are relevant to item recommendation are adapted. For items $I_j$ and $I_k$ defined as $\{(x_i, \mu_{x_i}(I_j)),\ i = 1, 2, \ldots, N\}$ and $\{(x_i, \mu_{x_i}(I_k)),\ i = 1, 2, \ldots, N\}$, a similarity measure between $I_j$ and $I_k$ is denoted by $S(I_k, I_j)$, and the different similarity measures are defined as

\[ S_1(I_k, I_j) = \frac{\sum_i \min\big(\mu_{x_i}(I_k), \mu_{x_i}(I_j)\big)}{\sum_i \max\big(\mu_{x_i}(I_k), \mu_{x_i}(I_j)\big)}, \tag{1} \]

\[ S_2(I_k, I_j) = \frac{\sum_i \mu_{x_i}(I_k)\,\mu_{x_i}(I_j)}{\sqrt{\sum_i \big(\mu_{x_i}(I_k)\big)^2}\,\sqrt{\sum_i \big(\mu_{x_i}(I_j)\big)^2}}, \tag{2} \]

\[ S_3(I_k, I_j) = 1 - \frac{d_2(I_k, I_j)}{\max_i \{\mu_{x_i}(I_k), \mu_{x_i}(I_j)\}}, \tag{3} \]

\[ S_4(I_k, I_j) = 1 - \frac{2}{Z_{I_k} + Z_{I_j}}\, d_2(I_k, I_j)^2, \tag{4} \]

where $d_2(I_k, I_j) = \sqrt{\ldots}$
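As an illustration of measures (1) and (2), the following is a minimal Python sketch, not the authors' implementation. It assumes each item is given as a plain list of membership degrees in [0, 1], one per feature. The function names `s1_set_theoretic` and `s2_cosine`, the sample vectors, and the choice to return 0 when a denominator vanishes are assumptions made for this example. Measures (3) and (4) are left out because the definition of the distance d2 is cut off in the text above.

```python
import math
from typing import Sequence


def _check_lengths(mu_k: Sequence[float], mu_j: Sequence[float]) -> None:
    if len(mu_k) != len(mu_j):
        raise ValueError("membership vectors must have the same length")


def s1_set_theoretic(mu_k: Sequence[float], mu_j: Sequence[float]) -> float:
    """Measure (1): sum of element-wise minima over sum of element-wise maxima."""
    _check_lengths(mu_k, mu_j)
    denominator = sum(max(a, b) for a, b in zip(mu_k, mu_j))
    if denominator == 0.0:
        return 0.0  # assumption: two all-zero membership vectors score 0
    return sum(min(a, b) for a, b in zip(mu_k, mu_j)) / denominator


def s2_cosine(mu_k: Sequence[float], mu_j: Sequence[float]) -> float:
    """Measure (2): dot product divided by the product of Euclidean norms."""
    _check_lengths(mu_k, mu_j)
    norm_k = math.sqrt(sum(a * a for a in mu_k))
    norm_j = math.sqrt(sum(b * b for b in mu_j))
    if norm_k == 0.0 or norm_j == 0.0:
        return 0.0  # assumption: an all-zero membership vector scores 0
    return sum(a * b for a, b in zip(mu_k, mu_j)) / (norm_k * norm_j)


# Hypothetical membership degrees of two items over three features.
item_k = [1.0, 0.5, 0.0]
item_j = [0.5, 0.5, 0.0]

print(s1_set_theoretic(item_k, item_j))
print(s2_cosine(item_k, item_j))
```

For these sample vectors, measure (1) evaluates to 1.0 / 1.5 and measure (2) to 0.75 / sqrt(0.625). Both measures return 1 when an item with at least one non-zero membership degree is compared with itself.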
Similar resources
Bilateral Weighted Fuzzy C-Means Clustering
The Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large datasets has always been one of the most important challenges in different sciences,...
A Fuzzy-Evolutionary Method for Software Fault Detection
Software defect detection is one of the most important challenges of software development, and it is the most prohibitive process in software development. The early detection of fault-prone modules helps software project managers to allocate the limited cost, time, and effort of developers to testing the defect-prone modules more intensively. In this paper, according to the importance of soft...
SOME SIMILARITY MEASURES FOR PICTURE FUZZY SETS AND THEIR APPLICATIONS
In this work, we present some novel processes to measure the similarity between picture fuzzy sets. Firstly, we adopt the concept of intuitionistic fuzzy sets, interval-valued intuitionistic fuzzy sets and picture fuzzy sets. Secondly, we develop some similarity measures between picture fuzzy sets, such as, cosine similarity measure, weighted cosine similarity measure, set-theoretic similar...
Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering
Clustering algorithms are highly dependent on different factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reduce the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...